Probabilistic Inference in Machine Learning. MSc in Data Science Methodology. Barcelona School of Economics.
Supervisor: Prof. Lorenzo Cappello, PhD
Importing the relevant libraries:
library(rstanarm)
Loading required package: Rcpp
This is rstanarm version 2.32.1
- See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
- Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
- For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores())
library(arm)
Loading required package: MASS
Loading required package: Matrix
Loading required package: lme4
arm (Version 1.14-4, built: 2024-4-1)
Working directory is /home/oliver/Documents/Term 2/Propabilistic Inference/Bayesian_Algorithm
Attaching package: 'arm'
The following objects are masked from 'package:rstanarm':
invlogit, logit
library(ggplot2)
library(dplyr)
Attaching package: 'dplyr'
The following object is masked from 'package:MASS':
select
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(rstan)
Loading required package: StanHeaders
rstan version 2.32.6 (Stan version 2.32.2)
For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)
For within-chain threading using `reduce_sum()` or `map_rect()` Stan functions,
change `threads_per_chain` option:
rstan_options(threads_per_chain = 1)
Attaching package: 'rstan'
The following object is masked from 'package:arm':
traceplot
library(bayesplot)
This is bayesplot version 1.11.1
- Online documentation and vignettes at mc-stan.org/bayesplot
- bayesplot theme set to bayesplot::theme_default()
* Does _not_ affect other ggplot2 plots
* See ?bayesplot_theme_set for details on theme setting
We set options for rstan to use multiple cores for faster computation:
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
Clear the environment:
rm(list=ls())
In this project, we perform a hierarchical Bayesian analysis of the radon dataset. Our objective is to explore the radon levels in houses across various counties and understand how the levels vary both within and between counties. Inspired by Gelman and Hill’s work on hierarchical models, we build on this foundation by implementing an enhanced Hamiltonian Monte Carlo approach in Stan. This method is expected to provide efficient and robust posterior inference for our multilevel model.
In this section, we explore the radon dataset to understand its structure and key variables.
First, we load the radon dataset. We can do this by using the load method as the dataset is included in the rstan package. Alternatively, we added the file as a CVS in the Project.zip:
data("radon")
#radon<-read.csv("radon_exported.csv") if the above gives errors due to package incompatibility.
Let’s inspect the first few rows:
head(radon)
floor county log_radon log_uranium
1 1 AITKIN 0.83290912 -0.6890476
2 0 AITKIN 0.83290912 -0.6890476
3 0 AITKIN 1.09861229 -0.6890476
4 0 AITKIN 0.09531018 -0.6890476
5 0 ANOKA 1.16315081 -0.8473129
6 0 ANOKA 0.95551145 -0.8473129
The dataset contains the following variables:
\(floor\): Indicates whether the measurement was taken in a basement (coded as 0) or on an upper floor (coded as 1).
\(county\): Identifies the county for each observation.
\(log_radon\): The natural logarithm of the radon measurement. This transformation is applied to stabilize variance and approach normality in the distribution of radon levels.
\(log_uranium\): The natural logarithm of the uranium measurement. Uranium concentration in the soil is considered a potential predictor of radon levels, as radon is a decay product of uranium.
We also check the dimensions of the dataset:
dim(radon)
[1] 919 4
This dataset consists of 919 observations.
To understand the distribution of radon levels, we visualize the histogram of the log-transformed radon measurements:
ggplot(radon, aes(x = log_radon)) +
geom_histogram(bins = 30, fill = "skyblue", color = "black") +
labs(title = "Distribution of Log Radon Levels", x = "Log Radon", y = "Frequency")
The distribution of log-transformed radon levels is roughly unimodal,
centered around 1 to 2 on the log scale, and shows a moderate right
tail. The log transformation reduces skew compared to raw radon
measurements and helps stabilize variance, making it more amenable to
hierarchical modeling.
Next, we explore how radon levels vary by county. A boxplot of log_radon across counties provides a clear visualization of the differences:
ggplot(radon, aes(x = factor(county), y = log_radon)) +
geom_boxplot(fill = "lightgreen", color = "black") +
labs(title = "Radon Levels by County", x = "County", y = "Log Radon") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
There is clear variation in log radon levels across counties, with some
counties showing higher or lower median values than others. Within each
county, there is also notable spread in the data. This suggests both
between-county and within-county variability and reinforces the need for
a hierarchical modeling approach that can capture these different levels
of variation.
Additionally, we also compute and plot the mean log radon level for each county:
county_summary <- radon %>%
group_by(county) %>%
summarise(mean_log_radon = mean(log_radon, na.rm = TRUE),
n = n())
ggplot(county_summary, aes(x = reorder(county, mean_log_radon), y = mean_log_radon)) +
geom_point() +
coord_flip() +
scale_x_discrete(expand = expansion(mult = c(0.05, 0.05))) +
labs(title = "Mean Log Radon Levels by County",
x = "County",
y = "Mean Log Radon") +
theme_minimal() +
theme(axis.text.y = element_text(size = 4))
Again, this shows that average radon levels vary noticeably across
counties, with some counties having much lower mean log radon than
others. The clear gradient from left to right indicates substantial
between-county variation, which supports using a hierarchical model to
capture these differences alongside the within-county variability.
An important predictor in our dataset is the \(floor\) variable, which indicates if the measurement is from a basement (0) or an upper floor (1). We examine how radon levels differ based on this predictor:
ggplot(radon, aes(x = factor(floor), y = log_radon)) +
geom_boxplot(fill = "orange", color = "black") +
labs(title = "Radon Levels by Floor (Basement vs Upper Floor)",
x = "Floor (0 = Basement, 1 = Upper Floor)", y = "Log Radon")
We can see that homes measured in the basement tend to have higher radon
levels than those measured on an upper floor. Although there is overlap
between the two groups, the boxplot shows a higher median and upper
quartile for basement measurements, which aligns with the idea that
radon gas often enters through the ground and accumulates more in lower
levels of a building.
Note: Additionally, including log_uranium as a predictor in the model allows us to account for potential covariate effects that might explain some of the variability in radon levels.
The radon data are naturally grouped by county, meaning that multiple observations (houses) are nested within each county. This leads to two sources of variability:
Within-County Variability: Variation in radon levels among houses within the same county.
Between-County Variability: Variation in the average radon level from one county to another.
To further illustrate this structure, we summarize key statistics by county:
county_stats <- radon %>%
group_by(county) %>%
summarise(n = n(),
mean_log_radon = mean(log_radon, na.rm = TRUE),
sd_log_radon = sd(log_radon, na.rm = TRUE))
county_stats
# A tibble: 85 × 4
county n mean_log_radon sd_log_radon
<fct> <int> <dbl> <dbl>
1 AITKIN 4 0.715 0.432
2 ANOKA 52 0.891 0.718
3 BECKER 3 1.09 0.717
4 BELTRAMI 7 1.19 0.894
5 BENTON 4 1.28 0.415
6 BIGSTONE 3 1.54 0.504
7 BLUEEARTH 14 1.93 0.542
8 BROWN 4 1.65 0.595
9 CARLTON 10 0.977 0.585
10 CARVER 6 1.22 1.90
# ℹ 75 more rows
The table shows that each of the 85 counties has a different number of observations (n) and distinct mean and standard deviation of log radon levels. Some counties have very few data points (e.g., 3 or 4), while others have many more (e.g., 52). The mean log radon values also vary considerably across counties, as do their standard deviations. This variation in both sample size and radon levels by county underscores the need for a hierarchical model that can account for county-to-county differences (random effects) while borrowing strength across counties.
We aim to model the log-transformed radon level \(y_{ij}\) for house \(i\) in county \(j\). We include a fixed effect for whether the measurement was taken in the basement (0) or on an upper floor (1). In addition, each county has its own intercept, reflecting the idea that some counties may have generally higher or lower radon levels. Note that this model is inspired by the hierarchical models presented by Gelman and Hill (2007). We can write this as:
\[ \begin{aligned} y_{ij} &\sim \mathcal{N}(\alpha_j + \beta \, x_{ij}, \sigma^2), \\[6pt] \alpha_j &\sim \mathcal{N}(\mu_\alpha, \tau^2), \end{aligned} \] where:
\(y_{ij}\) is the log radon measurement for house \(i\) in county \(j\).
\(x_{ij}\) is the floor variable (0 = basement, 1 = upper floor).
\(\alpha_j\) is the county-specific intercept, which follows a normal distribution centered at \(\mu_\alpha\) with standard deviation \(\tau\).
\(\beta\) is the fixed effect for the floor predictor.
\(\sigma\) is the within-county (house-level) residual standard deviation.
\(\mu_\alpha\) is the overall (global) intercept across all counties.
\(\tau\) captures the between-county variability.
To specify our model, we used the following distributions:
\[ \begin{aligned} y_{ij} &\sim \mathcal{N}(\alpha_j + \beta \, x_{ij}, \sigma^2), \\[6pt] \alpha_j &\sim \mathcal{N}(\mu_\alpha, \tau^2), \\[6pt] \mu_\alpha &\sim \mathcal{N}(0, 10^2), \quad \tau &\sim \text{Cauchy}^+(0, 2.5), \\[6pt] \beta &\sim \mathcal{N}(0, 10^2), \quad \sigma \sim \text{Cauchy}^+(0, 2.5). \end{aligned} \]
We shall denote this model our baseline model. The corresponding plate diagram can be seen in the file baseline_model.jpeg.
The goal of this Bayesian hierarchical model is to estimate the posterior distributions of all unknown parameters:
\(\beta\) (effect of basement vs. upper floor),
\(\mu_\alpha\) (overall average intercept),
\(\tau\) (between-county variability),
\(\sigma\) (within-county variability), and
each county-specific intercept \(\alpha_j\).
By estimating these posterior distributions, we can quantify both the uncertainty at the county level (how counties differ from one another) and the uncertainty at the house level (residual variability around each county’s intercept).
We will implement our baseline model in the chapter/code below. the corresponding model can be found in the file radon_model_centered.stan We load our Stan model (saved as radon_model_centered.stan), prepare the data for modeling, run the posterior sampler using Hamiltonian Monte Carlo (HMC) via the No-U-Turn Sampler (NUTS), and perform several diagnostic checks. NUTS is particularly effective for hierarchical models, as it uses gradient information to efficiently explore high-dimensional posterior distributions.
We first prepare the data in a list that matches the format expected by the Stan model. Note that we convert the county variable to an integer to serve as an index for the hierarchical structure.
radon_data <- list(
N = nrow(radon),
J = length(unique(radon$county)),
county = as.integer(radon$county),
y = radon$log_radon,
x = radon$floor
)
print(radon_data)
$N
[1] 919
$J
[1] 85
$county
[1] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[26] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[51] 2 2 2 2 2 2 3 3 3 4 4 4 4 4 4 4 5 5 5 5 6 6 6 7 7
[76] 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9
[101] 9 10 10 10 10 10 10 11 11 11 11 11 12 12 12 12 13 13 13 13 13 13 14 14 14
[126] 14 14 14 14 14 14 14 14 14 14 14 15 15 15 15 16 16 17 17 17 17 18 18 18 18
[151] 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19
[176] 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19
[201] 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 20 20 20 21
[226] 21 21 21 21 21 21 21 21 22 22 22 22 22 22 23 23 24 24 24 24 24 24 24 24 24
[251] 25 25 25 25 25 25 25 25 25 25 25 25 25 25 26 26 26 26 26 26 26 26 26 26 26
[276] 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26
[301] 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26
[326] 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26
[351] 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 27 27 27 27 27 27
[376] 28 28 28 28 28 29 29 29 30 30 30 30 30 30 30 30 30 30 30 31 31 31 31 31 32
[401] 32 32 32 33 33 33 33 34 34 34 35 35 35 35 35 35 35 36 36 37 37 37 37 37 37
[426] 37 37 37 38 38 38 38 39 39 39 39 39 40 40 40 40 41 41 41 41 41 41 41 41 45
[451] 45 45 45 45 45 45 45 45 45 45 45 45 42 43 43 43 43 43 43 43 43 43 44 44 44
[476] 44 44 44 44 46 46 46 46 46 47 47 48 48 48 48 48 48 48 48 48 49 49 49 49 49
[501] 49 49 49 49 49 49 49 49 50 51 51 51 51 52 52 52 53 53 53 54 54 54 54 54 54
[526] 54 54 54 54 54 54 54 54 54 54 54 54 54 54 54 54 54 55 55 55 55 55 55 55 55
[551] 56 56 56 57 57 57 57 57 57 58 58 58 58 59 59 59 59 60 60 61 61 61 61 61 61
[576] 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61
[601] 61 62 62 62 62 62 63 63 63 64 64 64 64 64 64 64 64 64 64 64 65 65 66 66 66
[626] 66 66 66 66 66 66 66 66 66 66 66 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[651] 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[676] 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[701] 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[726] 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[751] 70 70 67 67 67 67 67 67 67 67 67 67 67 67 67 68 68 68 68 68 68 68 68 69 69
[776] 69 69 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71
[801] 71 71 72 72 72 72 72 72 72 72 72 72 73 73 74 74 74 74 75 75 75 76 76 76 76
[826] 77 77 77 77 77 77 77 78 78 78 78 78 79 79 79 79 80 80 80 80 80 80 80 80 80
[851] 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80
[876] 80 80 80 80 80 80 80 80 80 80 80 80 81 81 81 82 83 83 83 83 83 83 83 83 83
[901] 83 83 83 83 84 84 84 84 84 84 84 84 84 84 84 84 84 85 85
$y
[1] 0.83290912 0.83290912 1.09861229 0.09531018 1.16315081 0.95551145
[7] 0.47000363 0.09531018 -0.22314355 0.26236426 0.26236426 0.33647224
[13] 0.40546511 -0.69314718 0.18232156 1.52605630 0.33647224 0.78845736
[19] 1.79175947 1.22377543 0.64185389 1.70474809 1.85629799 0.69314718
[25] 1.90210753 1.16315081 1.93152141 1.96009478 2.05412373 1.66770682
[31] 1.52605630 1.50407740 1.06471074 2.10413415 0.53062825 1.45861502
[37] 1.70474809 1.41098697 0.87546874 1.09861229 0.40546511 1.22377543
[43] 1.09861229 0.64185389 -1.20397280 0.91629073 0.18232156 0.83290912
[49] -0.35667494 0.58778666 1.09861229 0.83290912 0.58778666 0.40546511
[55] 0.69314718 0.64185389 0.26236426 1.48160454 1.52605630 1.85629799
[61] 1.54756251 1.75785792 0.83290912 -0.69314718 1.54756251 1.50407740
[67] 1.90210753 1.02961942 1.09861229 1.09861229 1.98787435 1.62924054
[73] 0.99325177 1.62924054 2.57261223 1.98787435 1.93152141 2.55722731
[79] 1.77495235 2.26176310 1.80828877 1.36097655 2.66722821 0.64185389
[85] 1.94591015 1.56861592 2.26176310 0.95551145 1.91692261 1.41098697
[91] 2.32238772 0.83290912 0.64185389 1.25276297 1.74046617 1.48160454
[97] 1.38629436 0.33647224 1.45861502 -0.10536052 0.74193734 0.53062825
[103] 2.56494936 2.69462718 1.56861592 2.27212589 -2.30258509 1.33500107
[109] 2.01490302 0.69314718 1.68639895 1.41098697 2.05412373 0.40546511
[115] 2.31253542 2.25129180 -0.10536052 1.50407740 1.62924054 0.78845736
[121] 0.58778666 2.10413415 0.00000000 2.56494936 0.99325177 1.28093385
[127] 3.28466357 0.47000363 2.57261223 2.18605128 2.97552957 0.95551145
[133] 2.20827441 2.58021683 1.30833282 1.94591015 1.58923521 1.25276297
[139] 0.00000000 1.25276297 1.02961942 0.40546511 1.93152141 2.41591378
[145] -2.30258509 0.95551145 0.64185389 0.53062825 0.09531018 0.00000000
[151] 1.09861229 1.50407740 0.47000363 1.43508453 0.95551145 1.91692261
[157] 1.48160454 1.72276660 1.30833282 1.06471074 2.68784749 1.91692261
[163] 2.09186406 0.99325177 1.06471074 1.50407740 0.58778666 0.74193734
[169] 0.74193734 0.47000363 2.27212589 2.10413415 1.28093385 -0.10536052
[175] 1.64865863 1.19392247 2.38876279 2.11625551 1.85629799 1.58923521
[181] 1.80828877 0.18232156 2.17475172 2.18605128 1.93152141 0.87546874
[187] 0.53062825 1.06471074 1.88706965 0.58778666 1.54756251 1.22377543
[193] 1.50407740 3.05870707 2.21920348 0.00000000 1.60943791 1.62924054
[199] 0.18232156 2.04122033 1.70474809 1.30833282 1.60943791 1.56861592
[205] 0.40546511 1.25276297 1.45861502 0.95551145 0.40546511 0.40546511
[211] 0.69314718 1.58923521 0.40546511 1.36097655 2.18605128 1.48160454
[217] 1.50407740 1.52605630 0.83290912 -0.51082562 1.77495235 1.70474809
[223] 1.98787435 1.75785792 2.01490302 1.58923521 1.93152141 1.87180218
[229] 1.33500107 1.72276660 2.06686276 1.50407740 1.02961942 1.25276297
[235] 1.45861502 0.87546874 0.33647224 1.66770682 -1.60943791 0.95551145
[241] 1.19392247 1.19392247 2.27212589 1.45861502 2.20827441 1.85629799
[247] 3.48737508 2.58776404 0.83290912 1.74046617 2.66722821 1.94591015
[253] 2.04122033 2.29253476 0.99325177 3.77505715 1.60943791 1.60943791
[259] 1.28093385 1.58923521 1.74046617 1.28093385 1.38629436 1.91692261
[265] 2.07944154 1.22377543 0.78845736 0.53062825 1.41098697 0.64185389
[271] 0.95551145 2.42480273 0.99325177 1.38629436 2.01490302 0.33647224
[277] 0.00000000 -0.69314718 0.95551145 1.80828877 0.74193734 1.70474809
[283] 1.13140211 1.09861229 1.72276660 1.43508453 1.38629436 2.70805020
[289] 1.98787435 0.87546874 1.06471074 1.50407740 0.47000363 2.16332303
[295] 1.74046617 2.16332303 1.36097655 0.64185389 0.69314718 1.72276660
[301] 0.95551145 -0.10536052 0.78845736 1.06471074 1.38629436 1.48160454
[307] 1.56861592 1.06471074 1.43508453 0.53062825 1.48160454 -0.22314355
[313] 1.72276660 1.22377543 1.72276660 0.95551145 1.02961942 2.14006616
[319] 1.22377543 1.19392247 2.16332303 0.58778666 1.75785792 2.57261223
[325] 1.02961942 1.56861592 1.74046617 2.63188884 2.04122033 1.75785792
[331] 1.54756251 2.04122033 0.99325177 1.52605630 1.79175947 0.83290912
[337] 0.91629073 1.41098697 1.54756251 1.54756251 2.39789527 2.04122033
[343] 1.13140211 0.47000363 0.53062825 2.80940270 1.16315081 1.64865863
[349] 1.60943791 1.80828877 0.00000000 0.64185389 1.38629436 1.74046617
[355] -0.69314718 0.99325177 1.30833282 1.84054963 3.16547505 1.38629436
[361] 1.09861229 1.13140211 1.56861592 1.13140211 1.45861502 1.36097655
[367] 1.13140211 1.48160454 1.09861229 1.25276297 2.15176220 2.20827441
[373] 1.58923521 1.30833282 0.83290912 1.06471074 -0.10536052 0.47000363
[379] 1.54756251 1.33500107 1.30833282 1.13140211 0.83290912 0.69314718
[385] 0.99325177 0.64185389 0.91629073 1.48160454 0.99325177 0.18232156
[391] 1.22377543 0.95551145 2.25129180 0.33647224 2.14006616 1.62924054
[397] 1.09861229 2.58021683 2.73436751 0.64185389 1.36097655 2.07944154
[403] 0.99325177 2.43361336 1.43508453 2.51769647 1.91692261 1.94591015
[409] 1.52605630 0.00000000 0.58778666 0.40546511 0.74193734 0.09531018
[415] 0.09531018 1.06471074 0.33647224 2.43361336 2.77881927 0.33647224
[421] 0.33647224 0.53062825 0.00000000 1.06471074 -0.51082562 0.47000363
[427] 1.97408103 -0.51082562 2.32238772 1.48160454 1.22377543 1.09861229
[433] 2.53369681 1.45861502 1.52605630 1.38629436 1.22377543 2.86789890
[439] 2.37024374 2.07944154 1.28093385 1.88706965 1.94591015 1.64865863
[445] 2.49320545 1.64865863 2.19722458 1.77495235 1.54756251 2.34180581
[451] 1.38629436 0.64185389 2.30258509 0.87546874 1.50407740 1.06471074
[457] 0.18232156 0.26236426 0.53062825 3.23867845 -2.30258509 2.37024374
[463] 1.38629436 0.47000363 3.17387846 0.00000000 0.40546511 0.18232156
[469] 1.06471074 3.87743156 0.00000000 2.12823171 1.43508453 -0.51082562
[475] 1.91692261 2.02814825 2.23001440 -0.51082562 0.47000363 0.87546874
[481] 1.38629436 1.98787435 0.78845736 1.19392247 -0.51082562 1.75785792
[487] 0.40546511 0.78845736 1.50407740 0.91629073 1.60943791 1.13140211
[493] 1.13140211 1.06471074 1.38629436 2.39789527 1.87180218 0.74193734
[499] 1.13140211 1.52605630 0.78845736 2.09186406 0.33647224 2.23001440
[505] 0.18232156 2.37024374 3.18221184 2.21920348 2.50143595 2.10413415
[511] 2.38876279 1.45861502 2.76000994 1.70474809 1.84054963 2.28238239
[517] 2.10413415 0.53062825 0.53062825 1.87180218 1.50407740 2.42480273
[523] 2.31253542 1.52605630 2.09186406 0.87546874 1.19392247 1.62924054
[529] 1.43508453 0.18232156 0.74193734 0.18232156 1.09861229 0.78845736
[535] 2.06686276 1.36097655 0.95551145 1.09861229 0.58778666 0.95551145
[541] 2.25129180 -0.35667494 1.02961942 0.18232156 0.78845736 2.49320545
[547] 2.54160199 1.19392247 1.45861502 1.36097655 1.33500107 1.77495235
[553] -0.91629073 1.43508453 1.06471074 0.69314718 0.26236426 0.26236426
[559] 0.47000363 2.25129180 0.58778666 2.50143595 1.48160454 1.94591015
[565] 0.40546511 0.95551145 2.27212589 1.36097655 1.25276297 1.93152141
[571] 1.30833282 0.83290912 0.99325177 0.78845736 1.96009478 0.26236426
[577] 1.36097655 1.28093385 1.45861502 0.53062825 1.06471074 2.16332303
[583] 1.84054963 1.66770682 1.02961942 0.26236426 1.28093385 1.72276660
[589] 2.32238772 1.72276660 0.26236426 1.60943791 1.41098697 1.28093385
[595] 0.95551145 0.26236426 1.02961942 0.58778666 1.16315081 -0.22314355
[601] 0.09531018 0.69314718 1.36097655 2.19722458 2.01490302 3.03495299
[607] 1.80828877 0.78845736 1.77495235 2.28238239 1.87180218 1.54756251
[613] 1.74046617 2.94968834 0.91629073 1.13140211 1.64865863 2.05412373
[619] 2.10413415 1.56861592 2.14006616 0.53062825 1.80828877 0.18232156
[625] 2.44234704 1.48160454 1.30833282 2.34180581 1.25276297 1.16315081
[631] 1.30833282 1.02961942 1.41098697 0.26236426 0.58778666 1.45861502
[637] -0.10536052 -0.51082562 0.91629073 0.87546874 1.54756251 2.40694511
[643] 2.70805020 2.16332303 1.52605630 0.47000363 1.38629436 0.64185389
[649] 0.53062825 -0.51082562 -0.69314718 -0.51082562 2.17475172 0.53062825
[655] 0.40546511 2.17475172 2.41591378 0.47000363 0.18232156 0.00000000
[661] -0.22314355 1.45861502 1.25276297 0.78845736 1.09861229 0.64185389
[667] 0.64185389 0.91629073 0.58778666 -0.10536052 2.46809953 0.64185389
[673] 1.06471074 1.28093385 1.30833282 1.28093385 1.13140211 1.19392247
[679] 1.16315081 1.22377543 0.58778666 1.74046617 1.25276297 0.47000363
[685] 3.47506723 0.18232156 0.78845736 -0.10536052 0.47000363 0.33647224
[691] 1.16315081 1.98787435 0.40546511 0.33647224 0.47000363 1.62924054
[697] 0.87546874 0.91629073 0.26236426 1.70474809 0.18232156 0.40546511
[703] 1.98787435 0.18232156 1.22377543 1.19392247 0.47000363 1.30833282
[709] -0.10536052 0.53062825 0.40546511 1.02961942 1.22377543 0.00000000
[715] -0.35667494 0.74193734 0.69314718 0.00000000 1.70474809 0.47000363
[721] 1.16315081 0.64185389 0.00000000 1.22377543 0.58778666 1.16315081
[727] -0.22314355 1.48160454 0.40546511 0.64185389 0.47000363 0.83290912
[733] 0.91629073 1.02961942 0.58778666 0.18232156 0.64185389 -1.20397280
[739] 0.83290912 1.54756251 0.78845736 0.74193734 -0.22314355 1.87180218
[745] 1.13140211 0.74193734 0.00000000 1.22377543 0.64185389 0.64185389
[751] 0.83290912 1.48160454 2.96527307 2.21920348 0.74193734 2.44234704
[757] 2.33214390 0.78845736 0.26236426 1.19392247 0.74193734 1.48160454
[763] 0.83290912 1.70474809 3.23080440 1.64865863 0.87546874 1.19392247
[769] 0.95551145 1.06471074 1.16315081 0.53062825 1.56861592 1.41098697
[775] 1.62924054 0.47000363 1.58923521 2.02814825 1.87180218 2.12823171
[781] 0.78845736 1.22377543 0.33647224 1.62924054 0.09531018 1.96009478
[787] 1.75785792 2.32238772 1.90210753 0.99325177 1.22377543 0.47000363
[793] 1.62924054 2.01490302 2.68102153 0.64185389 2.01490302 0.99325177
[799] 1.33500107 0.69314718 0.83290912 1.62924054 2.00148000 1.33500107
[805] 1.09861229 1.50407740 2.14006616 1.64865863 1.30833282 0.47000363
[811] 2.16332303 2.37024374 2.09186406 1.52605630 1.13140211 0.91629073
[817] 0.47000363 1.58923521 1.93152141 0.78845736 1.80828877 1.09861229
[823] 1.91692261 2.96527307 1.41098697 1.79175947 2.20827441 2.14006616
[829] 0.18232156 1.16315081 2.45100510 2.27212589 1.09861229 -0.22314355
[835] 1.19392247 1.56861592 1.58923521 -0.69314718 2.24070969 0.58778666
[841] 0.00000000 2.33214390 2.05412373 0.83290912 1.88706965 2.50959926
[847] 1.54756251 1.84054963 1.88706965 1.06471074 0.69314718 0.26236426
[853] 0.91629073 0.09531018 0.26236426 0.53062825 -0.10536052 0.58778666
[859] 1.56861592 0.58778666 1.22377543 -0.10536052 2.29253476 1.68639895
[865] 2.15176220 0.69314718 1.90210753 1.36097655 1.79175947 1.60943791
[871] 0.95551145 2.37954613 0.91629073 0.78845736 1.56861592 1.33500107
[877] 2.60268969 1.09861229 1.48160454 1.36097655 0.64185389 0.47000363
[883] 0.64185389 0.33647224 1.90210753 3.02042489 1.80828877 2.63188884
[889] 2.33214390 1.75785792 2.24070969 1.25276297 1.43508453 2.45958884
[895] 1.98787435 1.56861592 0.64185389 -0.22314355 1.56861592 2.33214390
[901] 2.43361336 2.04122033 2.47653840 -0.51082562 1.91692261 1.68639895
[907] 1.16315081 0.78845736 2.00148000 1.64865863 0.83290912 0.87546874
[913] 2.77258872 2.26176310 1.87180218 1.52605630 1.62924054 1.33500107
[919] 1.09861229
$x
[1] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[38] 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 1 1 1 0 0 1 0 0 0 0 0
[75] 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0
[112] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 1 0 0 0 0 1 1 1 0 0
[149] 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
[186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
[223] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0
[260] 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
[297] 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
[334] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
[371] 0 0 1 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[408] 0 1 1 0 0 1 1 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 0
[445] 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 1 0 1 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0
[482] 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
[519] 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 0 0 1 0 1 0 0
[556] 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0
[593] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 1 1 0 1
[630] 0 0 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0
[667] 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
[704] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0 0 0
[741] 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[778] 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
[815] 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
[852] 1 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[889] 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
We run our model using the No-U-Turn Sampler (NUTS), an advanced variant of Hamiltonian Monte Carlo (HMC). NUTS leverages gradient information from the posterior to efficiently navigate high-dimensional parameter spaces. It adaptively determines the optimal path length (i.e., the number of leapfrog steps), thereby avoiding unnecessary computations and manual tuning. This automatic adaptation helps ensure that the sampler explores the posterior distribution thoroughly, leading to better convergence and improved effective sample sizes. In our analysis, we begin with a baseline setup and aim to further refine the sampler in the subsequent parts.
We now run our Stan model using 4 chains with 2000 iterations each (1000 warmup iterations and 1000 sampling iterations):
fit <- stan(file = "radon_model_centered.stan",
data = radon_data,
iter = 2000,
warmup = 1000,
chains = 4,
seed = 123)
Summary of the posterior distributions for the key parameters:
print(fit, pars = c("beta", "mu_alpha", "tau", "sigma"), probs = c(0.025, 0.5, 0.975))
Inference for Stan model: anon_model.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 50% 97.5% n_eff Rhat
beta -0.66 0 0.07 -0.79 -0.66 -0.53 4247 1
mu_alpha 1.49 0 0.05 1.39 1.49 1.59 3114 1
tau 0.32 0 0.05 0.24 0.32 0.42 1365 1
sigma 0.73 0 0.02 0.69 0.73 0.76 6424 1
Samples were drawn using NUTS(diag_e) at Mon Mar 24 12:12:33 2025.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
Parameter Estimates:
\(\beta\): The fixed effect for the floor variable has a mean of -0.66, with a 95% credible interval from -0.79 to -0.66. This negative value suggests that, all else being equal, measurements taken on an upper floor (assuming floor is coded as 1 for upper floor) are associated with lower log radon levels compared to basement measurements.
\(\mu_\alpha\): The overall (global) intercept is estimated at 1.49, with a credible interval from 1.39 to 1.60. This represents the average log radon level for the baseline (reference) group.
\(\tau\): The between-county standard deviation is estimated at 0.32 (credible interval: 0.24 to 0.42), quantifying the variability in county-specific intercepts.
\(\sigma\): The residual standard deviation (within-county variability) is estimated at 0.73 (credible interval: 0.69 to 0.76).
After sampling, it is important to check that the chains have converged and that the posterior exploration is adequate. We use several diagnostic tools available in the bayesplot package.
Trace plots provide to us a visual representation of the sampling paths across chains. We expect that well-mixed chains should appear as overlapping, horizontal “fuzzy caterpillars.”
We convert the fitted model to an array for bayesplot and investigate the traces:
posterior_samples <- as.array(fit)
mcmc_trace(posterior_samples, pars = c("beta", "mu_alpha", "tau", "sigma"))
From these trace plots, we see that all four chains appear to have
converged and are mixing well. Each chain fluctuates around a similar
central value for each parameter, and we observe no pronounced upward or
downward trends. The overlap among chains is substantial, suggesting
that we have reached a stable sampling distribution. We also see no
abrupt jumps or signs of poor mixing, which reinforces that our NUTS
sampler is exploring the posterior efficiently.
Pair plots help us assess potential correlations between parameters and they also provide an overview of the joint posterior distributions.
mcmc_pairs(posterior_samples, pars = c("beta", "mu_alpha", "tau", "sigma"))
Marginal Distributions (Diagonals):
Each parameter’s histogram is roughly unimodal and does not exhibit strong skew or multiple modes. This suggests that the posterior for each parameter is relatively stable and well-identified.
Pairwise Relationships (Off-Diagonals):
The scatter plots show how each parameter’s posterior samples relate to the others. We do not see strong linear or nonlinear patterns, indicating that no pair of parameters is highly correlated. This relative lack of strong correlation is a good sign for efficient sampling and clear parameter interpretation.
Overall Posterior Behavior:
The tight, roughly elliptical scatter clouds in the off-diagonal plots, combined with unimodal marginals, suggest that our model converged well and that the parameters are well estimated. There is no visual evidence of pathological behavior such as multi-modality or heavy tail correlations that could impede inference.
In this subsection, we demonstrate several methods we have learned for assessing convergence of our Markov chains. By confirming that our sampler has stabilized around the same region across chains, exhibits minimal autocorrelation, and has sufficiently large effective sample sizes, we gain confidence that the posterior draws are reliable.
We ran multiple Markov chains in parallel. When each chain converges to the same region of parameter space, it provides evidence that the sampler is exploring the true posterior rather than getting stuck in a local mode. In Stan’s output, the \(\hat{R}\) statistic quantifies the potential scale reduction factor.
rhat_values <- rhat(fit, pars = c("beta", "mu_alpha", "tau", "sigma"))
print(rhat_values)
beta mu_alpha tau sigma
0.9992642 0.9998036 1.0009017 0.9993727
An \(\hat{R}\) value of 1 indicates perfect convergence, meaning that the variability between the chains is nearly identical to the variability within each chain. Typically, values below 1.1 are considered acceptable, so our values confirm that our chains have converged well. This gives us confidence that the posterior estimates are reliable and that our sampler has thoroughly explored the target distribution.
The effective sample size (ESS) measures how many nearly independent draws we have relative to the total samples. High autocorrelation within a chain reduces the ESS. Stan reports both an absolute ESS and an ESS ratio. A higher ESS indicates more information for reliable posterior estimation.
ess_values <- neff_ratio(fit, pars = c("beta", "mu_alpha", "tau", "sigma"))
print(ess_values)
beta mu_alpha tau sigma
1.0617332 0.7784023 0.3412737 1.6060289
The ESS ratio for \(\beta\) (2.05) and \(\sigma\) (1.75) indicate that these parameters are sampled with high efficiency — i.e., the chains for these parameters have relatively low autocorrelation. On the other hand, \(\mu_\alpha\) (0.63) and \(\tau\) (0.40) have lower ESS ratios, suggesting that their chains exhibit higher autocorrelation and thus provide fewer independent draws. Although these lower ratios might still be acceptable, they indicate that further tuning or additional iterations could be beneficial to improve the precision of our estimates for these parameters.
Even if the chains converge, strong autocorrelation in the draws can reduce the effective sample size. We can visualize the autocorrelation function (ACF) for each parameter using bayesplot.
posterior_samples <- as.array(fit)
mcmc_acf_bar(posterior_samples, pars = c("beta", "mu_alpha", "tau", "sigma"))
We see that \(\beta\) and \(\sigma\) exhibit a rapid decay toward zero,
indicating that successive draws for these parameters are relatively
uncorrelated. On the other hand, \(\mu_\alpha\) and \(\tau\) show a slower decay, suggesting
higher autocorrelation. This finding aligns with the lower ESS ratios
observed for these parameters. While the chains do converge overall, the
slower decay in autocorrelation for \(\mu_\alpha\) and \(\tau\) implies that we might need
additional iterations — or a different
parameterization — to achieve a higher number of effectively
independent draws for those parameters.
Density plots help visualize the marginal posterior distributions of each parameter. By overlaying densities from each chain, we can quickly assess whether the chains have converged to the same distribution and identify any irregularities such as multimodality or heavy tails.
posterior_samples <- as.array(fit)
mcmc_dens_overlay(posterior_samples, pars = c("beta", "mu_alpha", "tau", "sigma"))
These overlaid density plots show that all four chains produce nearly
identical posterior distributions for each parameter, indicating strong
convergence and no evidence of multi-modality. Each parameter’s
distribution appears unimodal, and the curves from different chains
overlap closely. This alignment suggests that the sampler has converged
to a stable region of parameter space and that our posterior estimates
for \(\beta\), \(\mu_\alpha\), \(\tau\), and \(\sigma\) are both consistent and
reliable.
Our first improvement of this model is based on Papaspiliopoulos, Roberts, and Sköld (2003). According to Papaspiliopoulos, Roberts, and Sköld (2003), non-centered parameterization can improve sampling efficiency in hierarchical models. Instead of sampling \(\alpha_j\) directly, we introduce a latent variable \(\alpha_{\text{raw}, j}\) that follows a standard normal, and then transform it:
\[\alpha_j = \mu_\alpha + \alpha_{\text{raw}, j} \cdot \tau.\]
To recap, In the centered parameterization on the other hand, we model the county-level intercepts \(\alpha_j\) directly as being drawn from a common distribution with parameters \(\mu_\alpha\) and \(\tau\).
This model assumes that each county-specific intercept \(\alpha_j\) is drawn from a population distribution centered at \(\mu_\alpha\) with standard deviation \(\tau\), allowing for partial pooling of information across counties. The slope \(\beta\) represents the effect of floor level, and \(\sigma\) is the residual variability within counties.
We call this the centered parameterization because we sample \(\alpha_j\) directly from its normal distribution:
\[ \alpha_j \sim \mathcal{N}(\mu_\alpha, \tau^2), \]
as opposed to defining a standard normal variable and transforming it (as in the non-centered case).
In the non-centered parameterization, we instead define:
\[ \alpha_j = \mu_\alpha + \tau \cdot \alpha_{\text{raw},j}, \quad \text{with } \alpha_{\text{raw},j} \sim \mathcal{N}(0, 1), \]
Theoretically, this reparameterization can improve convergence and sampling efficiency, especially when the data are weakly informative about group-level variation (e.g., sparse counties or small \(\tau\)). In contrast, the centered parameterization tends to work better when group-level effects are well-identified from the data (e.g., larger sample size per group, stronger signals).
In this section, we implement and compare the enhanced, non-centered version to examine differences in posterior estimates, sampling efficiency, and diagnostics.
As done before, we load our Stan model (saved as radon_model_non-centered.stan), prepare the data for modeling, run the posterior sampler using Hamiltonian Monte Carlo (HMC) via the No-U-Turn Sampler (NUTS), and perform several diagnostic checks. NUTS is particularly effective for hierarchical models, as it uses gradient information to efficiently explore high-dimensional posterior distributions.
We first prepare the data in a list that matches the format expected by the Stan model. Note that we convert the county variable to an integer to serve as an index for the hierarchical structure.
radon_data <- list(
N = nrow(radon),
J = length(unique(radon$county)),
county = as.integer(radon$county),
y = radon$log_radon,
x = radon$floor
)
print(radon_data)
$N
[1] 919
$J
[1] 85
$county
[1] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[26] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[51] 2 2 2 2 2 2 3 3 3 4 4 4 4 4 4 4 5 5 5 5 6 6 6 7 7
[76] 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9
[101] 9 10 10 10 10 10 10 11 11 11 11 11 12 12 12 12 13 13 13 13 13 13 14 14 14
[126] 14 14 14 14 14 14 14 14 14 14 14 15 15 15 15 16 16 17 17 17 17 18 18 18 18
[151] 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19
[176] 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19
[201] 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 20 20 20 21
[226] 21 21 21 21 21 21 21 21 22 22 22 22 22 22 23 23 24 24 24 24 24 24 24 24 24
[251] 25 25 25 25 25 25 25 25 25 25 25 25 25 25 26 26 26 26 26 26 26 26 26 26 26
[276] 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26
[301] 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26
[326] 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26
[351] 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 27 27 27 27 27 27
[376] 28 28 28 28 28 29 29 29 30 30 30 30 30 30 30 30 30 30 30 31 31 31 31 31 32
[401] 32 32 32 33 33 33 33 34 34 34 35 35 35 35 35 35 35 36 36 37 37 37 37 37 37
[426] 37 37 37 38 38 38 38 39 39 39 39 39 40 40 40 40 41 41 41 41 41 41 41 41 45
[451] 45 45 45 45 45 45 45 45 45 45 45 45 42 43 43 43 43 43 43 43 43 43 44 44 44
[476] 44 44 44 44 46 46 46 46 46 47 47 48 48 48 48 48 48 48 48 48 49 49 49 49 49
[501] 49 49 49 49 49 49 49 49 50 51 51 51 51 52 52 52 53 53 53 54 54 54 54 54 54
[526] 54 54 54 54 54 54 54 54 54 54 54 54 54 54 54 54 54 55 55 55 55 55 55 55 55
[551] 56 56 56 57 57 57 57 57 57 58 58 58 58 59 59 59 59 60 60 61 61 61 61 61 61
[576] 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61
[601] 61 62 62 62 62 62 63 63 63 64 64 64 64 64 64 64 64 64 64 64 65 65 66 66 66
[626] 66 66 66 66 66 66 66 66 66 66 66 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[651] 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[676] 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[701] 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[726] 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
[751] 70 70 67 67 67 67 67 67 67 67 67 67 67 67 67 68 68 68 68 68 68 68 68 69 69
[776] 69 69 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71 71
[801] 71 71 72 72 72 72 72 72 72 72 72 72 73 73 74 74 74 74 75 75 75 76 76 76 76
[826] 77 77 77 77 77 77 77 78 78 78 78 78 79 79 79 79 80 80 80 80 80 80 80 80 80
[851] 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80
[876] 80 80 80 80 80 80 80 80 80 80 80 80 81 81 81 82 83 83 83 83 83 83 83 83 83
[901] 83 83 83 83 84 84 84 84 84 84 84 84 84 84 84 84 84 85 85
$y
[1] 0.83290912 0.83290912 1.09861229 0.09531018 1.16315081 0.95551145
[7] 0.47000363 0.09531018 -0.22314355 0.26236426 0.26236426 0.33647224
[13] 0.40546511 -0.69314718 0.18232156 1.52605630 0.33647224 0.78845736
[19] 1.79175947 1.22377543 0.64185389 1.70474809 1.85629799 0.69314718
[25] 1.90210753 1.16315081 1.93152141 1.96009478 2.05412373 1.66770682
[31] 1.52605630 1.50407740 1.06471074 2.10413415 0.53062825 1.45861502
[37] 1.70474809 1.41098697 0.87546874 1.09861229 0.40546511 1.22377543
[43] 1.09861229 0.64185389 -1.20397280 0.91629073 0.18232156 0.83290912
[49] -0.35667494 0.58778666 1.09861229 0.83290912 0.58778666 0.40546511
[55] 0.69314718 0.64185389 0.26236426 1.48160454 1.52605630 1.85629799
[61] 1.54756251 1.75785792 0.83290912 -0.69314718 1.54756251 1.50407740
[67] 1.90210753 1.02961942 1.09861229 1.09861229 1.98787435 1.62924054
[73] 0.99325177 1.62924054 2.57261223 1.98787435 1.93152141 2.55722731
[79] 1.77495235 2.26176310 1.80828877 1.36097655 2.66722821 0.64185389
[85] 1.94591015 1.56861592 2.26176310 0.95551145 1.91692261 1.41098697
[91] 2.32238772 0.83290912 0.64185389 1.25276297 1.74046617 1.48160454
[97] 1.38629436 0.33647224 1.45861502 -0.10536052 0.74193734 0.53062825
[103] 2.56494936 2.69462718 1.56861592 2.27212589 -2.30258509 1.33500107
[109] 2.01490302 0.69314718 1.68639895 1.41098697 2.05412373 0.40546511
[115] 2.31253542 2.25129180 -0.10536052 1.50407740 1.62924054 0.78845736
[121] 0.58778666 2.10413415 0.00000000 2.56494936 0.99325177 1.28093385
[127] 3.28466357 0.47000363 2.57261223 2.18605128 2.97552957 0.95551145
[133] 2.20827441 2.58021683 1.30833282 1.94591015 1.58923521 1.25276297
[139] 0.00000000 1.25276297 1.02961942 0.40546511 1.93152141 2.41591378
[145] -2.30258509 0.95551145 0.64185389 0.53062825 0.09531018 0.00000000
[151] 1.09861229 1.50407740 0.47000363 1.43508453 0.95551145 1.91692261
[157] 1.48160454 1.72276660 1.30833282 1.06471074 2.68784749 1.91692261
[163] 2.09186406 0.99325177 1.06471074 1.50407740 0.58778666 0.74193734
[169] 0.74193734 0.47000363 2.27212589 2.10413415 1.28093385 -0.10536052
[175] 1.64865863 1.19392247 2.38876279 2.11625551 1.85629799 1.58923521
[181] 1.80828877 0.18232156 2.17475172 2.18605128 1.93152141 0.87546874
[187] 0.53062825 1.06471074 1.88706965 0.58778666 1.54756251 1.22377543
[193] 1.50407740 3.05870707 2.21920348 0.00000000 1.60943791 1.62924054
[199] 0.18232156 2.04122033 1.70474809 1.30833282 1.60943791 1.56861592
[205] 0.40546511 1.25276297 1.45861502 0.95551145 0.40546511 0.40546511
[211] 0.69314718 1.58923521 0.40546511 1.36097655 2.18605128 1.48160454
[217] 1.50407740 1.52605630 0.83290912 -0.51082562 1.77495235 1.70474809
[223] 1.98787435 1.75785792 2.01490302 1.58923521 1.93152141 1.87180218
[229] 1.33500107 1.72276660 2.06686276 1.50407740 1.02961942 1.25276297
[235] 1.45861502 0.87546874 0.33647224 1.66770682 -1.60943791 0.95551145
[241] 1.19392247 1.19392247 2.27212589 1.45861502 2.20827441 1.85629799
[247] 3.48737508 2.58776404 0.83290912 1.74046617 2.66722821 1.94591015
[253] 2.04122033 2.29253476 0.99325177 3.77505715 1.60943791 1.60943791
[259] 1.28093385 1.58923521 1.74046617 1.28093385 1.38629436 1.91692261
[265] 2.07944154 1.22377543 0.78845736 0.53062825 1.41098697 0.64185389
[271] 0.95551145 2.42480273 0.99325177 1.38629436 2.01490302 0.33647224
[277] 0.00000000 -0.69314718 0.95551145 1.80828877 0.74193734 1.70474809
[283] 1.13140211 1.09861229 1.72276660 1.43508453 1.38629436 2.70805020
[289] 1.98787435 0.87546874 1.06471074 1.50407740 0.47000363 2.16332303
[295] 1.74046617 2.16332303 1.36097655 0.64185389 0.69314718 1.72276660
[301] 0.95551145 -0.10536052 0.78845736 1.06471074 1.38629436 1.48160454
[307] 1.56861592 1.06471074 1.43508453 0.53062825 1.48160454 -0.22314355
[313] 1.72276660 1.22377543 1.72276660 0.95551145 1.02961942 2.14006616
[319] 1.22377543 1.19392247 2.16332303 0.58778666 1.75785792 2.57261223
[325] 1.02961942 1.56861592 1.74046617 2.63188884 2.04122033 1.75785792
[331] 1.54756251 2.04122033 0.99325177 1.52605630 1.79175947 0.83290912
[337] 0.91629073 1.41098697 1.54756251 1.54756251 2.39789527 2.04122033
[343] 1.13140211 0.47000363 0.53062825 2.80940270 1.16315081 1.64865863
[349] 1.60943791 1.80828877 0.00000000 0.64185389 1.38629436 1.74046617
[355] -0.69314718 0.99325177 1.30833282 1.84054963 3.16547505 1.38629436
[361] 1.09861229 1.13140211 1.56861592 1.13140211 1.45861502 1.36097655
[367] 1.13140211 1.48160454 1.09861229 1.25276297 2.15176220 2.20827441
[373] 1.58923521 1.30833282 0.83290912 1.06471074 -0.10536052 0.47000363
[379] 1.54756251 1.33500107 1.30833282 1.13140211 0.83290912 0.69314718
[385] 0.99325177 0.64185389 0.91629073 1.48160454 0.99325177 0.18232156
[391] 1.22377543 0.95551145 2.25129180 0.33647224 2.14006616 1.62924054
[397] 1.09861229 2.58021683 2.73436751 0.64185389 1.36097655 2.07944154
[403] 0.99325177 2.43361336 1.43508453 2.51769647 1.91692261 1.94591015
[409] 1.52605630 0.00000000 0.58778666 0.40546511 0.74193734 0.09531018
[415] 0.09531018 1.06471074 0.33647224 2.43361336 2.77881927 0.33647224
[421] 0.33647224 0.53062825 0.00000000 1.06471074 -0.51082562 0.47000363
[427] 1.97408103 -0.51082562 2.32238772 1.48160454 1.22377543 1.09861229
[433] 2.53369681 1.45861502 1.52605630 1.38629436 1.22377543 2.86789890
[439] 2.37024374 2.07944154 1.28093385 1.88706965 1.94591015 1.64865863
[445] 2.49320545 1.64865863 2.19722458 1.77495235 1.54756251 2.34180581
[451] 1.38629436 0.64185389 2.30258509 0.87546874 1.50407740 1.06471074
[457] 0.18232156 0.26236426 0.53062825 3.23867845 -2.30258509 2.37024374
[463] 1.38629436 0.47000363 3.17387846 0.00000000 0.40546511 0.18232156
[469] 1.06471074 3.87743156 0.00000000 2.12823171 1.43508453 -0.51082562
[475] 1.91692261 2.02814825 2.23001440 -0.51082562 0.47000363 0.87546874
[481] 1.38629436 1.98787435 0.78845736 1.19392247 -0.51082562 1.75785792
[487] 0.40546511 0.78845736 1.50407740 0.91629073 1.60943791 1.13140211
[493] 1.13140211 1.06471074 1.38629436 2.39789527 1.87180218 0.74193734
[499] 1.13140211 1.52605630 0.78845736 2.09186406 0.33647224 2.23001440
[505] 0.18232156 2.37024374 3.18221184 2.21920348 2.50143595 2.10413415
[511] 2.38876279 1.45861502 2.76000994 1.70474809 1.84054963 2.28238239
[517] 2.10413415 0.53062825 0.53062825 1.87180218 1.50407740 2.42480273
[523] 2.31253542 1.52605630 2.09186406 0.87546874 1.19392247 1.62924054
[529] 1.43508453 0.18232156 0.74193734 0.18232156 1.09861229 0.78845736
[535] 2.06686276 1.36097655 0.95551145 1.09861229 0.58778666 0.95551145
[541] 2.25129180 -0.35667494 1.02961942 0.18232156 0.78845736 2.49320545
[547] 2.54160199 1.19392247 1.45861502 1.36097655 1.33500107 1.77495235
[553] -0.91629073 1.43508453 1.06471074 0.69314718 0.26236426 0.26236426
[559] 0.47000363 2.25129180 0.58778666 2.50143595 1.48160454 1.94591015
[565] 0.40546511 0.95551145 2.27212589 1.36097655 1.25276297 1.93152141
[571] 1.30833282 0.83290912 0.99325177 0.78845736 1.96009478 0.26236426
[577] 1.36097655 1.28093385 1.45861502 0.53062825 1.06471074 2.16332303
[583] 1.84054963 1.66770682 1.02961942 0.26236426 1.28093385 1.72276660
[589] 2.32238772 1.72276660 0.26236426 1.60943791 1.41098697 1.28093385
[595] 0.95551145 0.26236426 1.02961942 0.58778666 1.16315081 -0.22314355
[601] 0.09531018 0.69314718 1.36097655 2.19722458 2.01490302 3.03495299
[607] 1.80828877 0.78845736 1.77495235 2.28238239 1.87180218 1.54756251
[613] 1.74046617 2.94968834 0.91629073 1.13140211 1.64865863 2.05412373
[619] 2.10413415 1.56861592 2.14006616 0.53062825 1.80828877 0.18232156
[625] 2.44234704 1.48160454 1.30833282 2.34180581 1.25276297 1.16315081
[631] 1.30833282 1.02961942 1.41098697 0.26236426 0.58778666 1.45861502
[637] -0.10536052 -0.51082562 0.91629073 0.87546874 1.54756251 2.40694511
[643] 2.70805020 2.16332303 1.52605630 0.47000363 1.38629436 0.64185389
[649] 0.53062825 -0.51082562 -0.69314718 -0.51082562 2.17475172 0.53062825
[655] 0.40546511 2.17475172 2.41591378 0.47000363 0.18232156 0.00000000
[661] -0.22314355 1.45861502 1.25276297 0.78845736 1.09861229 0.64185389
[667] 0.64185389 0.91629073 0.58778666 -0.10536052 2.46809953 0.64185389
[673] 1.06471074 1.28093385 1.30833282 1.28093385 1.13140211 1.19392247
[679] 1.16315081 1.22377543 0.58778666 1.74046617 1.25276297 0.47000363
[685] 3.47506723 0.18232156 0.78845736 -0.10536052 0.47000363 0.33647224
[691] 1.16315081 1.98787435 0.40546511 0.33647224 0.47000363 1.62924054
[697] 0.87546874 0.91629073 0.26236426 1.70474809 0.18232156 0.40546511
[703] 1.98787435 0.18232156 1.22377543 1.19392247 0.47000363 1.30833282
[709] -0.10536052 0.53062825 0.40546511 1.02961942 1.22377543 0.00000000
[715] -0.35667494 0.74193734 0.69314718 0.00000000 1.70474809 0.47000363
[721] 1.16315081 0.64185389 0.00000000 1.22377543 0.58778666 1.16315081
[727] -0.22314355 1.48160454 0.40546511 0.64185389 0.47000363 0.83290912
[733] 0.91629073 1.02961942 0.58778666 0.18232156 0.64185389 -1.20397280
[739] 0.83290912 1.54756251 0.78845736 0.74193734 -0.22314355 1.87180218
[745] 1.13140211 0.74193734 0.00000000 1.22377543 0.64185389 0.64185389
[751] 0.83290912 1.48160454 2.96527307 2.21920348 0.74193734 2.44234704
[757] 2.33214390 0.78845736 0.26236426 1.19392247 0.74193734 1.48160454
[763] 0.83290912 1.70474809 3.23080440 1.64865863 0.87546874 1.19392247
[769] 0.95551145 1.06471074 1.16315081 0.53062825 1.56861592 1.41098697
[775] 1.62924054 0.47000363 1.58923521 2.02814825 1.87180218 2.12823171
[781] 0.78845736 1.22377543 0.33647224 1.62924054 0.09531018 1.96009478
[787] 1.75785792 2.32238772 1.90210753 0.99325177 1.22377543 0.47000363
[793] 1.62924054 2.01490302 2.68102153 0.64185389 2.01490302 0.99325177
[799] 1.33500107 0.69314718 0.83290912 1.62924054 2.00148000 1.33500107
[805] 1.09861229 1.50407740 2.14006616 1.64865863 1.30833282 0.47000363
[811] 2.16332303 2.37024374 2.09186406 1.52605630 1.13140211 0.91629073
[817] 0.47000363 1.58923521 1.93152141 0.78845736 1.80828877 1.09861229
[823] 1.91692261 2.96527307 1.41098697 1.79175947 2.20827441 2.14006616
[829] 0.18232156 1.16315081 2.45100510 2.27212589 1.09861229 -0.22314355
[835] 1.19392247 1.56861592 1.58923521 -0.69314718 2.24070969 0.58778666
[841] 0.00000000 2.33214390 2.05412373 0.83290912 1.88706965 2.50959926
[847] 1.54756251 1.84054963 1.88706965 1.06471074 0.69314718 0.26236426
[853] 0.91629073 0.09531018 0.26236426 0.53062825 -0.10536052 0.58778666
[859] 1.56861592 0.58778666 1.22377543 -0.10536052 2.29253476 1.68639895
[865] 2.15176220 0.69314718 1.90210753 1.36097655 1.79175947 1.60943791
[871] 0.95551145 2.37954613 0.91629073 0.78845736 1.56861592 1.33500107
[877] 2.60268969 1.09861229 1.48160454 1.36097655 0.64185389 0.47000363
[883] 0.64185389 0.33647224 1.90210753 3.02042489 1.80828877 2.63188884
[889] 2.33214390 1.75785792 2.24070969 1.25276297 1.43508453 2.45958884
[895] 1.98787435 1.56861592 0.64185389 -0.22314355 1.56861592 2.33214390
[901] 2.43361336 2.04122033 2.47653840 -0.51082562 1.91692261 1.68639895
[907] 1.16315081 0.78845736 2.00148000 1.64865863 0.83290912 0.87546874
[913] 2.77258872 2.26176310 1.87180218 1.52605630 1.62924054 1.33500107
[919] 1.09861229
$x
[1] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[38] 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 1 1 1 0 0 1 0 0 0 0 0
[75] 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0
[112] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 1 0 0 0 0 1 1 1 0 0
[149] 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
[186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
[223] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0
[260] 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
[297] 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
[334] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
[371] 0 0 1 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[408] 0 1 1 0 0 1 1 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 0
[445] 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 1 0 1 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0
[482] 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
[519] 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 0 0 1 0 1 0 0
[556] 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0
[593] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 1 1 0 1
[630] 0 0 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0
[667] 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
[704] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0 0 0
[741] 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[778] 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
[815] 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
[852] 1 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[889] 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
We now run our Stan model using 4 chains with 2000 iterations each (1000 warmup iterations and 1000 sampling iterations):
fit <- stan(file = "radon_model_non-centered.stan",
data = radon_data,
iter = 2000,
warmup = 1000,
chains = 4,
seed = 123)
Summary of the posterior distributions for the key parameters:
print(fit, pars = c("beta", "mu_alpha", "tau", "sigma"), probs = c(0.025, 0.5, 0.975))
Inference for Stan model: anon_model.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 50% 97.5% n_eff Rhat
beta -0.66 0 0.07 -0.79 -0.66 -0.53 7497 1
mu_alpha 1.49 0 0.05 1.39 1.49 1.59 2292 1
tau 0.32 0 0.04 0.24 0.32 0.41 1587 1
sigma 0.73 0 0.02 0.69 0.73 0.76 7265 1
Samples were drawn using NUTS(diag_e) at Mon Mar 24 12:13:12 2025.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
Parameter Estimates:
\(\beta\): The fixed effect for the floor variable has a mean of -0.66, with a 95% credible interval from -0.79 to -0.54. This negative value suggests that, all else being equal, measurements taken on an upper floor (assuming floor is coded as 1 for upper floor) are associated with lower log radon levels compared to basement measurements.
\(\mu_\alpha\): The overall (global) intercept is estimated at 1.49, with a credible interval from 1.39 to 1.60. This represents the average log radon level for the baseline (reference) group.
\(\tau\): The between-county standard deviation is estimated at 0.32 (credible interval: 0.24 to 0.42), quantifying the variability in county-specific intercepts.
\(\sigma\): The residual standard deviation (within-county variability) is estimated at 0.73 (credible interval: 0.69 to 0.76).
After sampling, it is important to check that the chains have converged and that the posterior exploration is adequate. We use several diagnostic tools available in the bayesplot package.
Trace plots provide to us a visual representation of the sampling paths across chains. We expect that well-mixed chains should appear as overlapping, horizontal “fuzzy caterpillars.”
We convert the fitted model to an array for bayesplot and investigate the traces:
posterior_samples <- as.array(fit)
mcmc_trace(posterior_samples, pars = c("beta", "mu_alpha", "tau", "sigma"))
From these trace plots, we see that all four chains appear to have
converged and are mixing well. Each chain fluctuates around a similar
central value for each parameter, and we observe no pronounced upward or
downward trends. The overlap among chains is substantial, suggesting
that we have reached a stable sampling distribution. We also see no
abrupt jumps or signs of poor mixing, which reinforces that our NUTS
sampler is exploring the posterior efficiently.
Pair plots help us assess potential correlations between parameters and they also provide an overview of the joint posterior distributions.
mcmc_pairs(posterior_samples, pars = c("beta", "mu_alpha", "tau", "sigma"))
Marginal Distributions (Diagonals):
Each parameter’s histogram is roughly unimodal and does not exhibit strong skew or multiple modes. This suggests that the posterior for each parameter is relatively stable and well-identified.
Pairwise Relationships (Off-Diagonals):
The scatter plots show how each parameter’s posterior samples relate to the others. We do not see strong linear or nonlinear patterns, indicating that no pair of parameters is highly correlated. This relative lack of strong correlation is a good sign for efficient sampling and clear parameter interpretation.
Overall Posterior Behavior:
The tight, roughly elliptical scatter clouds in the off-diagonal plots, combined with unimodal marginals, suggest that our model converged well and that the parameters are well estimated. There is no visual evidence of pathological behavior such as multi-modality or heavy tail correlations that could impede inference.
In this subsection, we demonstrate several methods we have learned for assessing convergence of our Markov chains. By confirming that our sampler has stabilized around the same region across chains, exhibits minimal autocorrelation, and has sufficiently large effective sample sizes, we gain confidence that the posterior draws are reliable.
We ran multiple Markov chains in parallel. When each chain converges to the same region of parameter space, it provides evidence that the sampler is exploring the true posterior rather than getting stuck in a local mode. In Stan’s output, the \(\hat{R}\) statistic quantifies the potential scale reduction factor.
rhat_values <- rhat(fit, pars = c("beta", "mu_alpha", "tau", "sigma"))
print(rhat_values)
beta mu_alpha tau sigma
0.9997330 1.0011367 1.0027120 0.9994411
An \(\hat{R}\) value of 1 indicates perfect convergence, meaning that the variability between the chains is nearly identical to the variability within each chain. Typically, values below 1.1 are considered acceptable, so our values confirm that our chains have converged well. This gives us confidence that the posterior estimates are reliable and that our sampler has thoroughly explored the target distribution.
The effective sample size (ESS) measures how many nearly independent draws we have relative to the total samples. High autocorrelation within a chain reduces the ESS. Stan reports both an absolute ESS and an ESS ratio. A higher ESS indicates more information for reliable posterior estimation.
ess_values <- neff_ratio(fit, pars = c("beta", "mu_alpha", "tau", "sigma"))
print(ess_values)
beta mu_alpha tau sigma
1.8743437 0.5729788 0.3968207 1.8162346
The ESS ratio for \(\beta\) (2.05) and \(\sigma\) (1.75) indicate that these parameters are sampled with high efficiency — i.e., the chains for these parameters have relatively low autocorrelation. On the other hand, \(\mu_\alpha\) (0.63) and \(\tau\) (0.40) have lower ESS ratios, suggesting that their chains exhibit higher autocorrelation and thus provide fewer independent draws. Although these lower ratios might still be acceptable, they indicate that further tuning or additional iterations could be beneficial to improve the precision of our estimates for these parameters.
Even if the chains converge, strong autocorrelation in the draws can reduce the effective sample size. We can visualize the autocorrelation function (ACF) for each parameter using bayesplot.
posterior_samples <- as.array(fit)
mcmc_acf_bar(posterior_samples, pars = c("beta", "mu_alpha", "tau", "sigma"))
We see that \(\beta\) and \(\sigma\) exhibit a rapid decay toward zero,
indicating that successive draws for these parameters are relatively
uncorrelated. On the other hand, \(\mu_\alpha\) and \(\tau\) show a slower decay, suggesting
higher autocorrelation. This finding aligns with the lower ESS ratios
observed for these parameters. While the chains do converge overall, the
slower decay in autocorrelation for \(\mu_\alpha\) and \(\tau\) implies that we might need
additional iterations — or a different
parameterization — to achieve a higher number of effectively
independent draws for those parameters.
Density plots help visualize the marginal posterior distributions of each parameter. By overlaying densities from each chain, we can quickly assess whether the chains have converged to the same distribution and identify any irregularities such as multimodality or heavy tails.
posterior_samples <- as.array(fit)
mcmc_dens_overlay(posterior_samples, pars = c("beta", "mu_alpha", "tau", "sigma"))
These overlaid density plots show that all four chains produce nearly
identical posterior distributions for each parameter, indicating strong
convergence and no evidence of multi-modality. Each parameter’s
distribution appears unimodal, and the curves from different chains
overlap closely. This alignment suggests that the sampler has converged
to a stable region of parameter space and that our posterior estimates
for \(\beta\), \(\mu_\alpha\), \(\tau\), and \(\sigma\) are both consistent and
reliable.
Changing prios and so on.
Continue here by changing the model/enhancing it in a different way.
Gelman, A., & Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
Papaspiliopoulos, O., Roberts, G. O., & Sköld, M. (2003). Non-centered parameterizations for hierarchical models and data augmentation. Bayesian Statistics, 7, 307–326.